This explains how to do it: https://forum.arduino.cc/index.php?topic=610354.0
Fortunately, Optiboot 8 recently added a new function do_spm() which, from what I gather, you call from your code. It then takes the new compiled sketch binary that your OTA code is assumed to have placed at the end of flash, writes it to the beginning of flash, and then jumps execution to the start of the freshly written sketch. In a nutshell, it appears that's all there is to it.
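To make the copy step concrete, here is a rough host-side simulation in C of what "move the staged image from the end of flash to the beginning" amounts to. It is only a sketch under assumptions: a RAM array stands in for flash, and `sim_page_erase`, `sim_page_write`, `copy_staged_image` and `STAGING_ADDR` are hypothetical names of my own, not Optiboot's API. The 32 KB flash and 128-byte page figures are those of an ATmega328P. The final jump into the new sketch is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host-side simulation only: a RAM array stands in for AVR flash. */
#define FLASH_SIZE   32768u  /* ATmega328P flash size in bytes */
#define PAGE_SIZE    128u    /* ATmega328P flash page size in bytes */
#define STAGING_ADDR 16384u  /* hypothetical: OTA code staged the image here */

static uint8_t sim_flash[FLASH_SIZE];

/* Hypothetical stand-in for a page erase: erased flash reads as 0xFF. */
static void sim_page_erase(uint32_t addr) {
    assert(addr % PAGE_SIZE == 0);
    memset(&sim_flash[addr], 0xFF, PAGE_SIZE);
}

/* Hypothetical stand-in for a page write. Programming flash can only
 * clear bits (1 -> 0), which is modelled here with a bitwise AND. */
static void sim_page_write(uint32_t addr, const uint8_t *buf) {
    assert(addr % PAGE_SIZE == 0);
    for (uint32_t i = 0; i < PAGE_SIZE; i++) {
        sim_flash[addr + i] &= buf[i];
    }
}

/* Copy image_len bytes from the staging area to address 0, one page at a
 * time. Returns the number of pages written, or -1 if the image is empty,
 * would overlap the staging area, or would run past the end of flash. */
static int copy_staged_image(uint32_t image_len) {
    if (image_len == 0 || image_len > STAGING_ADDR) return -1;
    if (image_len > FLASH_SIZE - STAGING_ADDR) return -1;

    uint8_t page[PAGE_SIZE];
    int pages = 0;
    for (uint32_t off = 0; off < image_len; off += PAGE_SIZE) {
        uint32_t n = image_len - off;
        if (n > PAGE_SIZE) n = PAGE_SIZE;
        memset(page, 0xFF, PAGE_SIZE);            /* pad a partial last page */
        memcpy(page, &sim_flash[STAGING_ADDR + off], n);
        sim_page_erase(off);                      /* erase before programming */
        sim_page_write(off, page);
        pages++;
    }
    return pages;
}
```

On the real chip the erase and write steps have to execute from the bootloader section, since on parts like the ATmega328P the SPM instruction only works from there. As far as I can tell, that is the reason the application has to call into Optiboot at all instead of writing flash itself.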
Prior to this, you had to write your own bootloader. It appears that now you can get by without one. Instead, the heavy lifting moves into your application code, where development is more familiar. That may not be as robust as an unchanging, tried-and-true FOTA bootloader that's protected, but it does seem like a faster path to getting FOTA up and running for a particular radio across a range of MCUs, without having to make a custom bootloader for each one.
What a relief! I was on the verge of switching over to PICs because of this exact issue. Hopefully now I won't have to.